State our null and alternative hypotheses
Identify the distribution and compute the degrees of freedom
Find the critical value using what we know in step 2
Compute Student’s t-test or our ANOVA (we’ll get here this week)
Reject or fail to reject our null hypothesis
We use a one-tailed test if our hypothesis predicts a direction (e.g., greater than, less than, bigger, smaller, etc.).
We use a two-tailed test (most common) if we are looking for a difference between two groups without predicting its direction.
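To see how the number of tails changes the cutoff, here is a minimal sketch. It assumes a standard normal distribution instead of the t-distribution, only because Python's standard library has no t quantile function. The helper name `critical_z` is made up for this illustration. The point is how \(\alpha\) gets split, not the exact critical values you would read from a t-table.

```python
from statistics import NormalDist


def critical_z(alpha: float, tails: int) -> float:
    """Return the positive critical z-value for a one- or two-tailed test.

    Assumption: standard normal, used here as a stand-in for the t-distribution.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha evenly between the two tails;
    # a one-tailed test puts all of alpha in a single tail.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)


one_tailed = critical_z(0.05, tails=1)
two_tailed = critical_z(0.05, tails=2)
print(f"one-tailed cutoff: {one_tailed:.3f}")
print(f"two-tailed cutoff: {two_tailed:.3f}")
```

At the same \(\alpha\), the two-tailed cutoff is further from zero, because each tail only gets half of \(\alpha\).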
Image: Towards Data Science
A measure of dispersion obtained by pooling multiple sample data sets (in our case two) into one large data set to calculate a pooled standard deviation.
\[\sigma_p = \sqrt{\frac{{(n_1 - 1) \cdot \sigma^2_1 + (n_2 - 1) \cdot \sigma^2_2}}{{n_1 + n_2 - 2}}}\]
\(n\) = sample size
\(\sigma\) = standard deviation
We compare this t-value with our critical value to decide whether to reject or fail to reject our null.
\[t = \frac{{\bar{X}_1 - \bar{X}_2}}{{\sigma_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}}\]
\(\bar{X}\) = Sample mean
\(n\) = Sample size
\(\sigma_p\) = pooled standard deviation
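The two formulas above can be sketched in Python as follows. The function names `pooled_sd` and `t_statistic` are made up for this illustration, and the summary statistics at the bottom are hypothetical values chosen so the arithmetic is easy to check by hand.

```python
import math


def pooled_sd(n1: int, sd1: float, n2: int, sd2: float) -> float:
    """Pooled standard deviation of two groups from their sizes and SDs."""
    numerator = (n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
    return math.sqrt(numerator / (n1 + n2 - 2))


def t_statistic(mean1: float, mean2: float,
                n1: int, sd1: float, n2: int, sd2: float) -> float:
    """Independent-samples t-value using the pooled standard deviation."""
    sp = pooled_sd(n1, sd1, n2, sd2)
    return (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))


# Hypothetical summary statistics (not taken from any real data set).
sp = pooled_sd(n1=10, sd1=2.0, n2=10, sd2=2.0)
t = t_statistic(mean1=5.0, mean2=4.0, n1=10, sd1=2.0, n2=10, sd2=2.0)
print(f"pooled SD = {sp:.2f}, t = {t:.2f}")
```

When both groups have the same standard deviation, the pooled value equals that shared standard deviation, which makes a handy sanity check.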
A test that is used to compare the means of two groups, where the independent variable is categorical and our dependent variable is continuous.
Helps determine if the differences between means are reliably different or just due to chance.
We use the t-distribution here and we calculate the degrees of freedom by:
\[df = (n_1 + n_2) - 2\]
Suppose we have two groups of students - Group A and Group B. We want to know if there is a significant difference in their mean exam scores at the 95% confidence level (\(\alpha = 0.05\)).
Group A
n = 20
\(\bar{x}\) = 75
\(\sigma\) = 10
Group B
n = 20
\(\bar{x}\) = 80
\(\sigma\) = 12
We use a t-distribution (p. 373) and we calculate the degrees of freedom to find our critical value.
\(df = (n_1 + n_2) - 2\)
\((20 + 20) - 2 = 38\)
\[\sigma_p = \sqrt{\frac{{(n_1 - 1) \cdot \sigma^2_1 + (n_2 - 1) \cdot \sigma^2_2}}{{n_1 + n_2 - 2}}}\]
\[t = \frac{{\bar{X}_1 - \bar{X}_2}}{{\sigma_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}}\]
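Plugging the Group A and Group B values into these two formulas gives the following. Treating Group A as group 1 is an assumption here, which is why the t-value comes out negative.

\[\sigma_p = \sqrt{\frac{(20 - 1) \cdot 10^2 + (20 - 1) \cdot 12^2}{20 + 20 - 2}} = \sqrt{\frac{1900 + 2736}{38}} = \sqrt{122} \approx 11.05\]

\[t = \frac{75 - 80}{11.05 \sqrt{\frac{1}{20} + \frac{1}{20}}} \approx \frac{-5}{3.49} \approx -1.43\]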
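Our critical value is \(\pm\) 2.021. This is the two-tailed 0.05 entry for df = 40, the row closest to our df = 38 in tables that skip from 30 to 40; the exact value for df = 38 is about 2.024.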
If the absolute value of the t-value is greater than the critical t-value, we reject the null hypothesis and conclude that there is a significant difference between the means of the two groups.
\(|-1.43| < 2.021\)
\(1.43 < 2.021\)
Therefore, 🥁 drum roll 🥁…
We fail to reject the null hypothesis!
There is not enough evidence to conclude that there is a meaningful difference in the mean exam scores between Group A and Group B.
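The comparison we just made can be written as a tiny decision rule. This is only a sketch: the function name `two_tailed_decision` is made up for this illustration, and the t-value and critical value are the ones from the example above.

```python
def two_tailed_decision(t_value: float, critical_value: float) -> str:
    """Compare |t| with the positive critical value of a two-tailed test."""
    if abs(t_value) > critical_value:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Values from the exam score example: t = -1.43, critical value = 2.021.
decision = two_tailed_decision(t_value=-1.43, critical_value=2.021)
print(f"We {decision}.")
```

Taking the absolute value is what makes the rule two-tailed: a large negative t-value counts as much as a large positive one.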
Used to test whether the means of more than two groups differ from one another.
The independent variable must be categorical (e.g., ) and the dependent variable continuous.
Between-group variance
Measures how much the group means differ from one another within the sample
The larger this variance is relative to the within-group variance, the more likely the groups are distinct from each other
Within-group variance
Measures the spread of data within each group
Considered to be the error or residual
Focuses on how much the values within each group deviate from that group’s mean
\[\bar{X}_{\text{overall}} = \frac{\sum_{i=1}^{N} X_i}{N}\]
\[SSB = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X}_{\text{overall}})^2\]
\[SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2\]
\(N\) = total number of observations
\(k\) = number of groups
\(n_i\) = sample size of group \(i\)
\[MSB = \frac{SSB}{df_B}\]
\[MSW = \frac{SSW}{df_W}\]
\[F = \frac{MSB}{MSW}\]
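The degrees of freedom in the two mean squares are not spelled out above. Under the usual one-way ANOVA convention, with \(k\) groups and \(N\) observations in total (symbols introduced here for illustration), they would be:

\[df_B = k - 1\]
\[df_W = N - k\]

For example, 3 groups of 3 observations each would give \(df_B = 3 - 1 = 2\) and \(df_W = 9 - 3 = 6\).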
A clinical trial is run to compare weight loss programs and participants are randomly assigned to one of the comparison programs and are counseled on the details of the assigned program. Participants follow the assigned program for 4 weeks. The outcome of interest is weight loss in pounds.
| Low Calorie | Low Fat | Low Carb |
|---|---|---|
| 5 | 3 | 6 |
| 4 | 4 | 4 |
| 5 | 2 | 1 |
\(H_0\): There is no difference in mean weight loss between the diet programs.
\(H_A\): There is a difference in mean weight loss between the diet programs.
1 Find our group means: Low Calorie \(\approx 4.67\), Low Fat \(= 3\), Low Carb \(\approx 3.67\)
2 Find our \(\bar{X}_{\text{overall}}\)
\(\frac{4.67 + 3 + 3.67}{3} \approx 3.78\)
Here we have \(k = 3\) groups and \(N = 9\) observations in total.
\[\bar{X}_{\text{overall}} = \frac{\sum_{i=1}^{N} X_i}{N}\]
\[SSB = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X}_{\text{overall}})^2\]
\[SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2\]
\[MSB = \frac{SSB}{df_B}\]
\[MSW = \frac{SSW}{df_W}\]
\[F = \frac{MSB}{MSW}\]
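Substituting the table values into these formulas, with the group means kept as exact fractions (\(\tfrac{14}{3}\), \(3\), \(\tfrac{11}{3}\)) and the overall mean as \(\tfrac{34}{9}\), gives the following. Figures that were rounded at an earlier step can differ from these in the second decimal place.

\[SSB = 3\left(\tfrac{14}{3} - \tfrac{34}{9}\right)^2 + 3\left(3 - \tfrac{34}{9}\right)^2 + 3\left(\tfrac{11}{3} - \tfrac{34}{9}\right)^2 = \tfrac{342}{81} \approx 4.22\]

\[SSW = \tfrac{2}{3} + 2 + \tfrac{38}{3} = \tfrac{46}{3} \approx 15.33\]

\[MSB = \frac{4.22}{2} \approx 2.11 \qquad MSW = \frac{15.33}{6} \approx 2.56\]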
\[F = \frac{2.11}{2.56} \approx 0.83\]
Head to the F Distribution Table on p. 377
\(df_B = 2\)
\(df_W = 6\)
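\[\text{Critical value} = 5.14\]
Because our F-value is smaller than the critical value, we fail to reject the null hypothesis: there is not enough evidence of a difference in mean weight loss between the diet programs.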
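The whole diet example can be double-checked with a short from-scratch sketch. The function name `one_way_anova` and the dictionary keys are made up for this illustration, and it assumes the usual \(df_B = k - 1\) and \(df_W = N - k\). The critical value is the table lookup from above, not something the code computes.

```python
def one_way_anova(groups):
    """Return the sums of squares, mean squares and F for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    n_total = len(all_values)      # N: total number of observations
    k = len(groups)                # k: number of groups
    grand_mean = sum(all_values) / n_total
    group_means = [sum(group) / len(group) for group in groups]

    # Between-group sum of squares: group size times squared distance
    # of each group mean from the overall mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared distance of each value
    # from its own group's mean.
    ssw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    df_b = k - 1
    df_w = n_total - k
    msb = ssb / df_b
    msw = ssw / df_w
    return {"ssb": ssb, "ssw": ssw, "df_b": df_b, "df_w": df_w,
            "msb": msb, "msw": msw, "f": msb / msw}


low_calorie = [5, 4, 5]
low_fat = [3, 4, 2]
low_carb = [6, 4, 1]

result = one_way_anova([low_calorie, low_fat, low_carb])
critical_value = 5.14  # read from the F table for df_B = 2, df_W = 6
reject = result["f"] > critical_value

print(f"SSB = {result['ssb']:.2f}, SSW = {result['ssw']:.2f}")
print(f"F = {result['f']:.2f}, reject the null: {reject}")
```

Carried at full precision, F comes out at about 0.83, well below 5.14, so the decision does not hinge on small rounding differences in the hand calculation.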